Breast cancer is one of the most common cancers in women around the world. For diagnosis, pathologists evaluate biomarkers such as the HER2 protein using immunohistochemistry on tissue extracted by biopsy. Through microscopic inspection, this assessment estimates the intensity and integrity of the cell membrane staining and scores the sample as 0, 1+, 2+, or 3+: a subjective decision that depends on the interpretation of the pathologist. This paper presents a preliminary data analysis of the annotations of three pathologists over the same set of samples, obtained at 20x magnification and comprising $1,252$ non-overlapping biopsy patches. We evaluate intra- and inter-expert variability, finding substantial and moderate agreement, respectively, according to Fleiss' kappa coefficient, as a preliminary step toward generating a HER2 breast cancer biopsy gold standard using supervised learning from multiple pathologist annotations.
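The agreement statistic named above can be illustrated with a minimal sketch. It assumes every patch receives the same number of labels; the function name `fleiss_kappa` and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that each received the same number of labels.

    ratings: list of per-item label lists, e.g. [["0", "1+", "0"], ...]
    categories: all possible labels, e.g. ["0", "1+", "2+", "3+"]
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of labels")

    counts = [Counter(item) for item in ratings]

    # Observed agreement: per-item proportion of agreeing rater pairs, averaged.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    proportions = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_chance = sum(p ** 2 for p in proportions)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical toy labels: four patches, three raters each (not study data).
HER2_SCORES = ["0", "1+", "2+", "3+"]
toy_ratings = [
    ["0", "0", "0"],
    ["1+", "1+", "2+"],
    ["2+", "2+", "2+"],
    ["3+", "2+", "3+"],
]
toy_kappa = fleiss_kappa(toy_ratings, HER2_SCORES)
```

On the commonly used Landis and Koch scale, values between 0.41 and 0.60 are read as moderate agreement and values between 0.61 and 0.80 as substantial agreement.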
Erroneous correspondences between samples and their respective channel or target commonly arise in several real-world applications. For instance, whole-brain calcium imaging of freely moving organisms, multiple target tracking, or multi-person contactless vital sign monitoring may be severely affected by mismatched sample-channel assignments. To systematically address this fundamental problem, we pose it as a signal reconstruction problem where we have lost correspondences between the samples and their respective channels. We show that under the assumption that the signals of interest admit a sparse representation over an overcomplete dictionary, unique signal recovery is possible. Our derivations reveal that the problem is equivalent to a structured unlabeled sensing problem without precise knowledge of the sensing matrix. Unfortunately, existing methods are neither robust to errors in the regressors nor do they exploit the structure of the problem. Therefore, we propose a novel robust two-step approach for the reconstruction of shuffled sparse signals. The performance and robustness of the proposed approach are illustrated in an application of whole-brain calcium imaging in computational neuroscience. The proposed framework can be generalized to sparse signal representations other than the ones considered in this work and applied to a variety of real-world problems with imprecise measurement or channel assignment.
Most benchmarks for studying surgical interventions focus on a specific challenge instead of leveraging the intrinsic complementarity among different tasks. In this work, we present a new experimental framework towards holistic surgical scene understanding. First, we introduce the Phase, Step, Instrument, and Atomic Visual Action recognition (PSI-AVA) Dataset. PSI-AVA includes annotations for both long-term (Phase and Step recognition) and short-term reasoning (Instrument detection and novel Atomic Action recognition) in robot-assisted radical prostatectomy videos. Second, we present Transformers for Action, Phase, Instrument, and steps Recognition (TAPIR) as a strong baseline for surgical scene understanding. TAPIR leverages our dataset's multi-level annotations as it benefits from the learned representation on the instrument detection task to improve its classification capacity. Our experimental results in both PSI-AVA and other publicly available databases demonstrate the adequacy of our framework to spur future research on holistic surgical scene understanding.
Tourette Syndrome (TS) is a behavior disorder with onset in childhood, characterized by the expression of involuntary movements and sounds commonly referred to as tics. Behavioral therapy is the first-line treatment for patients with TS, and it helps patients raise awareness about tic occurrence as well as develop tic inhibition strategies. However, the limited availability of therapists and the difficulties of in-home follow-up work limit its effectiveness. An automatic tic detection system that is easy to deploy could alleviate the difficulties of home therapy by providing feedback to the patients while exercising tic awareness. In this work, we propose a novel architecture (T-Net) for automatic tic detection and classification from untrimmed videos. T-Net combines temporal detection and segmentation and operates on features that are interpretable to a clinician. We compare T-Net to several state-of-the-art systems working on deep features extracted from the raw videos; T-Net achieves comparable performance in terms of average precision while relying on the interpretable features needed in clinical practice.
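This work presents a novel method for predicting vehicle trajectories in highway scenarios using efficient bird's-eye-view representations and convolutional neural networks. With a basic visual representation, vehicle positions, motion histories, road configuration, and vehicle interactions are easily included in the prediction model. The U-Net model has been selected as the prediction kernel to generate future visual representations of the scene using an image-to-image regression approach. A method has been implemented to extract vehicle positions from the generated graphical representations with sub-pixel resolution. The method has been trained and evaluated using the PREVENTION dataset, an on-board sensor dataset. Different network configurations and scene representations have been evaluated. This study found that a U-Net with 6 depth levels, a linear terminal layer, and a Gaussian representation of the vehicles is the best-performing configuration. The use of lane markings was not found to improve prediction performance. For a predicted trajectory length of 2.0 seconds, the average prediction error is 0.47 and 0.38 meters and the final prediction error is 0.76 and 0.53 meters for the longitudinal and lateral coordinates, respectively. The prediction error is up to 50% lower than that of the baseline method.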
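The abstract does not say how vehicle positions are extracted from the generated images. As an illustration only, the sketch below uses an intensity-weighted centroid, one common way to locate a Gaussian-like blob with sub-pixel precision; the function name `subpixel_centroid`, the threshold parameter, and the toy heatmap are assumptions, not the authors' method.

```python
def subpixel_centroid(heatmap, threshold=0.0):
    """Return the intensity-weighted centroid (x, y) of a 2D heatmap.

    heatmap: list of rows, each a list of non-negative intensities.
    threshold: pixels at or below this value are ignored (assumed parameter).
    Returns None when no pixel exceeds the threshold.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for row_idx, row in enumerate(heatmap):
        for col_idx, value in enumerate(row):
            if value > threshold:
                total += value
                sum_x += value * col_idx
                sum_y += value * row_idx
    if total == 0.0:
        return None
    return (sum_x / total, sum_y / total)


# Hypothetical blob spread evenly over four pixels: its centre lies between them.
toy_heatmap = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
toy_position = subpixel_centroid(toy_heatmap)
```

In a scene with several vehicles, a centroid like this would have to be computed per blob, for example after separating connected regions; that step is omitted here.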
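The objective of pre-trained language models is to learn contextual representations of textual data. Pre-trained language models have become mainstream in natural language processing and code modeling. Using probes, a technique for studying the linguistic properties of hidden vector spaces, previous works have shown that these pre-trained language models encode simple linguistic properties in their hidden representations. However, none of the previous work has assessed whether these models encode the whole grammatical structure of a programming language. In this paper, we demonstrate the existence of a \textit{syntactic subspace}, lying in the hidden representations of pre-trained language models, which contains the syntactic information of the programming language. We show that this subspace can be extracted from the models' representations and define a novel probing method, AST-Probe, that recovers the whole abstract syntax tree (AST) of an input code snippet. In our experiments, we show that this syntactic subspace exists in five state-of-the-art pre-trained language models. In addition, we highlight that the middle layers of the models are the ones that encode most of the AST information. Finally, we estimate the optimal size of this syntactic subspace and show that its dimension is substantially lower than that of the models' representation spaces. This suggests that pre-trained language models use a small part of their representation spaces to encode the syntactic information of programming languages.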
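Using cameras for vehicle speed measurement is far more cost-effective than other technologies such as inductive loops, radar, or laser. However, accurate speed measurement remains a challenge because of the inherent limitations of cameras in providing accurate range estimates. In addition, classical vision-based methods are very sensitive to the extrinsic calibration between the camera and the road. In this context, data-driven approaches are an interesting alternative. However, data collection requires a complex and costly setup to record videos from cameras synchronized with high-precision speed sensors in order to generate ground-truth speed values. It has recently been shown that driving simulators (e.g., CARLA) can serve as a robust alternative for generating large synthetic datasets for vehicle speed estimation with a single camera. In this paper, we study the same problem using multiple cameras at different virtual locations and with different extrinsic parameters. We address the question of whether a complex 3D-CNN architecture can implicitly learn view-invariant speeds using a single model, or whether view-specific models are more appropriate. The results are very promising: they show that a single model trained with data from multiple views reports better accuracy than camera-specific models, paving the way toward a view-invariant vehicle speed measurement system.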
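This work explores the reproducibility of CFGAN. CFGAN and its family of models (TagRec, MTPR, and CRGAN) learn to generate personalized, fake rankings of preferences for top-N recommendation using only previous interactions. This work successfully replicates the results published in the original paper and discusses the impact of certain differences between the CFGAN framework and the model used in the original evaluation. The absence of random noise and the use of real user profiles as condition vectors leave the generator prone to learning a degenerate solution in which the output vector is identical to the input vector, so that it behaves essentially as a simple autoencoder. The work further extends the experimental analysis by comparing CFGAN against a selection of simple, well-known, and properly optimized baselines, observing that CFGAN is not consistently competitive against them despite its high computational cost. To ensure the reproducibility of these analyses, this work describes the experimental methodology and publishes all datasets and source code.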
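It has been shown that dual encoders trained on one domain often fail to generalize to other domains for retrieval tasks. One widespread belief is that the bottleneck layer of a dual encoder, where the final score is simply a dot product between a query vector and a passage vector, is too limited to make dual encoders an effective retrieval model for out-of-domain generalization. In this paper, we challenge this belief by scaling up the size of the dual encoder model {\em while keeping the bottleneck embedding size fixed}. Surprisingly, scaling up the model size brings significant improvement on a variety of retrieval tasks, especially for out-of-domain generalization. Experimental results show that our dual encoders, \textbf{G}eneralizable \textbf{T}5-based dense \textbf{R}etrievers (GTR), significantly outperform %ColBERT~\cite{khattab2020colbertt} and existing sparse and dense retrievers on the BEIR dataset~\cite{Thakur2021Beir}. Most surprisingly, our ablation study finds that GTR is very data efficient, as it needs only 10\% of the MS MARCO supervised data to achieve the best out-of-domain performance. All the GTR models are released at https://tfhub.dev/google/collections/gtr/1.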
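The bottleneck described above, where relevance is just a dot product between two fixed-size vectors, can be sketched as follows. This is a minimal illustration with hand-written toy embeddings; the names `dot` and `rank_passages` are hypothetical, and no encoder or GTR model is involved.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted by descending dot-product score."""
    return sorted(
        range(len(passage_vecs)),
        key=lambda i: dot(query_vec, passage_vecs[i]),
        reverse=True,
    )


# Hypothetical 2-dimensional embeddings; a real dual encoder would produce
# these vectors from text with separately applied query and passage encoders.
toy_query = [1.0, 0.0]
toy_passages = [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]]
toy_ranking = rank_passages(toy_query, toy_passages)
```

Because the score depends only on the two vectors, passage embeddings can be computed once and indexed ahead of time, whatever the size of the encoder that produced them.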
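The field of neural networks has seen significant advances in recent years with the development of deep and convolutional neural networks. Although many current works address real-valued models, recent studies reveal that neural networks with hypercomplex-valued parameters can better capture, generalize, and represent the complexity of multidimensional data. This paper explores the application of quaternion-valued convolutional neural networks to the diagnosis of acute lymphoblastic leukemia. Specifically, we compare the performance of real-valued and quaternion-valued convolutional neural networks in classifying lymphoblasts from peripheral blood smear microscopic images. The quaternion-valued convolutional neural network achieves better or similar performance compared with its corresponding real-valued network while using only 34% of its parameters. This result confirms that quaternion algebra allows information to be captured and extracted from color images with fewer parameters.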